Sort process

ABSTRACT

IN THE SORT PROCESS DISCLOSED HEREIN, THERE IS FIRST SELECTED A RANDOM SAMPLE OF THE RECORDS OF A FILE TO BE SORTED INTO AN ORDERED SEQUENCE. THIS SAMPLE MAY SUITABLY HAVE THE SIZE SL+S-1, WHEREIN L+1 IS THE QUANTITY OF SUBSETS DESIRED FROM A PARTICULAR DISTRIBUTION PASS AND S IS A SELECTABLE PARAMETER. THE SELECTED SAMPLE IS SORTED INTO AN ORDERED SEQUENCE, AND THE FILE IS THEN PARTITIONED IN ACCORDANCE WITH EVERY S-TH KEY OF THE SORTED SAMPLE INTO L+1 SUBSETS. THE RECORDS OF EACH OF THE SUBSETS ARE AGAIN PARTITIONED INTO L+1 SUBSETS AS DESCRIBED ABOVE, AND THIS PROCESS IS CONTINUED UNTIL THE SUBSETS ARE SMALL ENOUGH TO BE CONVENIENTLY SORTED INTO RESPECTIVE ORDERED SEQUENCES, PREFERABLY EMPLOYING A TREE TYPE SORT. THE ORDERED SEQUENCES ARE THEN CONCATENATED TO FORM THE SORTED FILE. THE RANDOM SAMPLE MAY BE PROVIDED, FOR EXAMPLE, BY USING A RANDOM NUMBER GENERATOR TO GENERATE INTEGERS IN THE RANGE OF 1 TO N, WHEREIN N IS THE TOTAL QUANTITY OF RECORDS IN THE FILE, UNTIL SL+S-1 DISTINCT INTEGERS HAVE BEEN GENERATED. THE RECORDS AT THESE INTEGER ADDRESSES IN THE FILE CAN THEN BE SELECTED TO CONSTITUTE THE SAMPLE.

DEFENSIVE PUBLICATION UNITED STATES PATENT OFFICE Published at the request of the applicant or owner in accordance with the Notice of Dec. 16, 1969, 869 O.G. 687. The abstracts of Defensive Publication applications are identified by distinctly numbered series and are arranged chronologically. The heading of each abstract indicates the number of pages of specification, including claims and sheets of drawings contained in the application as originally filed. The files of these applications are available to the public for inspection, and reproductions may be purchased for 30 cents a sheet.

Defensive Publication applications have not been examined as to the merits of alleged invention. The Patent Office makes no assertion as to the novelty of the disclosed subject matter.

PUBLISHED AUGUST 14, 1973 T913,007 SORT PROCESS Archie Charles McKellar, Mount Kisco, N.Y., assignor to International Business Machines Corporation, Armonk, N.Y.

Continuation of application Ser. No. 214,200, Dec. 30, 1971. This application Feb. 20, 1973, Ser. No. 333,920 Int. Cl. G06f 7/06, 7/22 U.S. Cl. 444-1 8 Sheets Drawing. 25 Pages Specification

In the sort process disclosed herein, there is first selected a random sample of the records of a file to be sorted into an ordered sequence. This sample may suitably have the size sl+s-1 wherein l+1 is the quantity of subsets desired from a particular distribution pass and s is a selectable parameter. The selected sample is sorted into an ordered sequence and the file is then partitioned in accordance with every sth key of the sorted sample into l+1 subsets. The records of each of the subsets are again partitioned into l+1 subsets as described above and this process is continued until the size of the subsets is small enough so that they can be conveniently sorted into respective ordered sequences, preferably employing a tree type sort. The ordered sequences are then concatenated to form the sorted file. The random sample may be provided, for example, by using a random number generator to generate integers in the range of 1 to n wherein n is the total quantity of records in the file until sl+s-1 distinct integers have been generated. The records at these integer addresses in the file can then be selected to constitute the sample.
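The sample selection described in the abstract can be illustrated with a minimal Python sketch. The function name `sample_addresses` and its parameters are hypothetical and do not come from the publication; the sketch only assumes that addresses run from 1 to n and that repeated integers are discarded until sl+s-1 distinct ones have been drawn.

```python
import random

def sample_addresses(n, s, l, rng=random):
    """Draw integers in 1..n until s*l + s - 1 distinct ones have been seen.

    n is the quantity of records in the file; s and l are the parameters
    named in the abstract. Returns the chosen addresses in ascending order.
    """
    k = s * l + s - 1
    if k > n:
        raise ValueError("sample size s*l + s - 1 exceeds the file size n")
    chosen = set()
    while len(chosen) < k:
        chosen.add(rng.randint(1, n))  # a repeated integer leaves the set unchanged
    return sorted(chosen)
```

For example, `sample_addresses(11, 2, 2)` yields five distinct addresses between 1 and 11; the records stored at those addresses would then constitute the sample.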

Aug. 14, 1973    A. C. McKELLAR    SORT PROCESS    Original Filed Dec. 30, 1971    8 Sheets-Sheet 1

[Sheet 1: drawing not recoverable; the only legible label is MAXIMUM KEY VALUE.]

FIG. 2 (8 Sheets-Sheet 2), flowchart box text as far as it is legible:

THIS PROGRAM WILL SORT A FILE CONSISTING OF RECORDS X(1), ..., X(n)
SELECT A RANDOM SAMPLE OF SIZE sl+s-1 FROM THE FILE
SORT THE SAMPLE TO OBTAIN Y(1), ..., Y(sl+s-1)
PARTITION THE FILE INTO l+1 SUBSETS [subset names and defining condition illegible]
[decision on the size of a subset, with YES and NO branches; condition illegible]
SORT [subset] BY THIS PROGRAM
CONCATENATE [subset] WITH [illegible]
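The overall process stated in the abstract and outlined by the FIG. 2 flowchart might be sketched as follows. This is an illustration under stated assumptions, not the applicant's program: the name `sample_sort`, the threshold parameter `small`, the use of `random.sample` to pick the sample, the built-in `sorted` as a stand-in for the "tree type sort", and the handling of equal keys are all choices made here, since the subset definition in the drawing is illegible.

```python
import bisect
import random

def sample_sort(records, s=2, l=3, small=8):
    """Return the records in ascending order using sample-based partitioning."""
    n = len(records)
    k = s * l + s - 1                      # sample size from the abstract
    if n <= small or n < k:
        return sorted(records)             # stand-in for the "tree type sort"

    # Select and sort a random sample of k records (assumed selection method).
    sample = sorted(records[a] for a in random.sample(range(n), k))
    # Every s-th key of the sorted sample: Y(s), Y(2s), ..., Y(ls).
    splitters = sample[s - 1::s]

    # Distribute the file into l + 1 subsets. Assumption: a record equal to
    # a splitter goes into the subset below that splitter.
    subsets = [[] for _ in range(l + 1)]
    for rec in records:
        subsets[bisect.bisect_left(splitters, rec)].append(rec)

    # Sort each subset by the same process and concatenate the results.
    out = []
    for sub in subsets:
        if len(sub) == n:                  # no progress (e.g. all keys equal)
            out.extend(sorted(sub))
        else:
            out.extend(sample_sort(sub, s, l, small))
    return out
```

With s=2 and l=2 the sample holds five records and each pass produces three subsets. Larger s gives splitters that divide the file more evenly at the cost of sorting a larger sample.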

FIG. 3A (8 Sheets-Sheet 3)

ADDRESS   1   2   3   4   5   6   7   8   9  10  11
RECORD    4   9  17  24  13   6  25   2  18  14   3

FIG. 3B

ADDRESS   1   2   3   4   5   6   7   8   9  10  11
RECORD    2   3   4   6   9  13  14  17  18  24  25

[A further diagram shows the FILE divided into SUBSETS; subset labels and boundaries are not recoverable.]

[Drawing residue, including FIG. 6: diagrams of a FILE divided into SUBSETS; labels and values are not recoverable.]

FIG. 7, flowchart box text as far as it is legible (assignment arrows restored as <-):

SORT BUCKET [illegible] OF LEVEL [illegible]
SELECT A RANDOM SAMPLE
SORT THE SAMPLE
K(I) <- K(I)+1
PARTITION THE FILE COUNTING THE QUANTITY OF RECORDS WHICH GO INTO EACH BUCKET
[several decisions with YES and NO branches; conditions illegible]
END

FIG. 8, flowchart box text as far as it is legible:

PUT RECORDS [illegible] INTO [subsets; details illegible]
GET RECORD FROM INPUT
[decision: END OF FILE, with YES and NO branches]
FIND i SUCH THAT [condition illegible]
PUT T INTO S(i)
END

FIG. 10A (8 Sheets-Sheet 7)

ADDRESS   [illegible]
RECORD    4   9  17  24  13   6  25   2  18  14   3

FIG. 10B

[ADDRESS and RECORD rows not recoverable; surviving labels: CYLINDERS, FILLED AREA, UNFILLED AREA.]

FIG. 11, flowchart box text as far as it is legible (assignment arrows restored as <-):

PROCEDURE TO SORT Y(1), ..., Y(n)
LOWER <- 1, UPPER <- n
SELECT j SUCH THAT 1 <= j <= n
T <- Y(j)
Y(j) <- Y(1)
[decision comparing Y(UPPER) with T]
Y(LOWER) <- Y(UPPER)
UPPER <- UPPER-1
[decision comparing Y(LOWER) with T]
LOWER <- LOWER+1
Y(UPPER) <- Y(LOWER)
[decisions: UPPER = LOWER]
[decision comparing n-UPPER with LOWER-1]
SORT Y(1), ..., Y(LOWER-1) BY THIS PROCEDURE
SORT Y(UPPER+1), ..., Y(n) BY THIS PROCEDURE
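The legible fragments of FIG. 11 (LOWER, UPPER, T taken from Y(j), moves between Y(LOWER) and Y(UPPER), and two recursive sorts) are consistent with a partition-exchange procedure. The Python sketch below is a reconstruction under that assumption only; the exact tests and box order in the drawing cannot be read, and the name `partition_sort` and the 0-based indices are choices made here. The drawing also appears to compare n-UPPER with LOWER-1, which may order the two recursive sorts by size; the sketch does not reproduce that.

```python
import random

def partition_sort(y, lo=0, hi=None):
    """Sort y[lo..hi] in place by partitioning around a randomly chosen key."""
    if hi is None:
        hi = len(y) - 1
    if lo >= hi:
        return
    j = random.randint(lo, hi)       # select j within the current range
    t = y[j]                         # T <- Y(j)
    y[j] = y[lo]                     # Y(j) <- Y(1): the free slot is now at lo
    lower, upper = lo, hi
    while lower < upper:
        # Scan down from the top for a key smaller than T.
        while lower < upper and y[upper] >= t:
            upper -= 1
        if lower < upper:
            y[lower] = y[upper]      # free slot moves to upper
            lower += 1
        # Scan up from the bottom for a key larger than T.
        while lower < upper and y[lower] <= t:
            lower += 1
        if lower < upper:
            y[upper] = y[lower]      # free slot moves back to lower
            upper -= 1
    y[lower] = t                     # lower == upper marks the free slot
    partition_sort(y, lo, lower - 1)
    partition_sort(y, lower + 1, hi)
```

Applied to the eleven records of FIG. 3A, the procedure would leave the list in the order shown in FIG. 3B.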

